UNUSUALLY LONG HYPTIOTES (ARANEAE, ULOBORIDAE)
SEQUENCE FOR SMALL SUBUNIT (18S) RIBOSOMAL 
RNA SUPPORTS SECONDARY STRUCTURE 
MODEL UTILITY IN SPIDERS 


J. C. Spagna and R. G. Gillespie: Division of Insect Biology, Department of
Environmental Science, Policy and Management, 137 Mulford Hall, UC Berkeley,
Berkeley, California 94720, USA. E-mail: jspagna@berkeley.edu

ABSTRACT. We report on the structure of the small-subunit ribosomal RNA (18S rRNA) sequence 
from Hyptiotes gertschi (Araneae, Uloboridae), which is the largest 18S gene sequenced in any arachnid 
to date. We compare this remarkable sequence to those from a range of other spiders and arachnids, and 
develop base-pairing models of its insert regions to determine its overall secondary structure. The H. 
gertschi sequence of 1902 bases is 86 nucleotides longer than any comparable spider sequence and contains 
5 inserts between 5 and 28 bases in length, all at regions characterized as among the most variable in 
eukaryotic 18S genes. Inserts were also found in one of these variable regions in published sequences of 
3 species of hard ticks (Acari, Ixodidae). Other arachnid taxa were remarkably uniform in 18S primary 
sequence length, ranging from 1802 to 1816 nucleotides. Thermodynamic modeling of the H. gertschi 
inserts suggests they are largely self-complementary, extending the stem portions of the variable regions. 

Keywords: Phylogenetics, arachnids, Acari, gene inserts 


The small subunit ribosomal RNA, or 18S rRNA, is one of a small set of commonly used
sequences for molecular phylogenetic reconstruction of arthropod relationships (reviewed
in Caterino et al. 2000). The gene coding for the 18S rRNA contains sections that are
highly conserved, and these provide informative characters for the assessment of
relationships between distantly related taxa, such as metazoan relationships (Giribet &
Wheeler 2001; Mallatt et al. 2004). The 18S rRNA has also been used in studies of
divergence between arachnid orders, as well as studies of divergence between spider
genera and species (Wheeler & Hayashi 1998; Arnedo et al. 2004). In the context of
arachnid molecular phylogenetics, to understand both the potential utility and drawbacks
in the use of any genetic marker, it is important to have knowledge of the amount of
variation in both the primary sequence and secondary structures across taxa at different
levels. This is because the secondary structure can influence the rate at which
different parts of the primary sequence vary.

The genes coding for ribosomal RNAs contain regions that can accumulate and lose bases
through insertion and deletion events (indels) more easily than protein-coding genes,
which are constrained by the requirement that they maintain an open reading frame for
proper translation into a functional primary amino acid sequence. Indels can change the
length of the gene and make homology assessments of individual base-pairs, and often
long stretches of base-pairs, difficult. The proper methodological approach to this
sequence-alignment problem is a source of controversy in phylogenetics, and opinions
range from using static alignments, with regions that are difficult to align being
either included or discarded (e.g., Nardi et al. 2003), to using direct optimization
(Wheeler 1996), which avoids the arbitrary removal of data and possible evolutionary
signal and avoids the problem of multiple-sequence alignment altogether. An additional
feature of direct optimization, as implemented in the program POY (Wheeler 1999), is the
ability to treat inserts as multistate characters with as many states as there are
unique insert sequences, with a matrix of character-state transformation costs. See
Giribet & Wheeler (2001) for an example of implementation of the latter method.

A very different approach, and one that has been promoted by many authors, is the use of
secondary structure models to aid in homology assessment and relative weighting (Dixon
& Hillis 1993; Kjer 2004). Recently, it has been proposed that elements of rRNA
secondary structure can provide a basis for choice between multiple models to be used in
a partitioned Bayesian analysis (Telford et al. 2005). For this technique, bases from a
variety of taxa would be compared to a secondary structure model for the gene for
assignment to “stem” and “loop” (and possibly transitional or ambiguous) partitions.
These partitions would be individually tested for the most appropriate likelihood
models, then subsequently analyzed using the partitioned, multiple-model approach.
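
The assignment of sites to “stem” and “loop” partitions described above can be sketched in a few lines of Python. This is only an illustration, not the procedure of Telford et al. (2005): it assumes the secondary structure model is supplied as a dot-bracket string aligned column-for-column with the sequence, and the function name `partition_sites` is hypothetical.

```python
from typing import Dict, List


def partition_sites(structure: str) -> Dict[str, List[int]]:
    """Split 0-based site indices into 'stem' and 'loop' partitions.

    Assumption: `structure` is a dot-bracket string in which '(' and ')'
    mark paired (stem) sites and '.' marks unpaired (loop) sites.
    """
    partitions: Dict[str, List[int]] = {"stem": [], "loop": []}
    depth = 0  # bracket balance, used to reject malformed models
    for index, symbol in enumerate(structure):
        if symbol == "(":
            depth += 1
            partitions["stem"].append(index)
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ')' at position %d" % index)
            partitions["stem"].append(index)
        elif symbol == ".":
            partitions["loop"].append(index)
        else:
            raise ValueError("unexpected symbol %r" % symbol)
    if depth != 0:
        raise ValueError("unmatched '(' in structure")
    return partitions


# A 4-bp stem closed by a 4-base loop, followed by two unpaired bases.
example = partition_sites("((((....))))..")
```

The two index lists could then be handed to whatever partitioned analysis is in use; transitional or ambiguous sites would need a third symbol and a third list.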

The RNAs produced by the 18S genes form secondary structure by base-pairing with their
own complementary stretches of sequence, forming helical structures commonly referred
to as stems, while the non-pairing regions form loop structures that connect multiple
stems, or form the terminal turn in the self-complementary stems. Because the overall
secondary structure is functionally important for the ribosome, and because two
complementary changes in primary sequence are required to maintain a stem structure, it
has been proposed that the stems should be treated differently in phylogenetic analysis
than the less-constrained loops (Dixon & Hillis 1993). To do so requires the ability to
tell whether nucleotides are in a loop or a stem structure, which can be modeled using
computer algorithms that develop secondary structures based on comparative or
thermodynamic information (reviewed in Gardner & Giegerich 2004).

Here we present a comparison of 18S structures sampled from arachnid orders and spider
lineages to assess the utility of secondary structure information currently available
for making homology judgments across arachnid taxa, and for determining data partitions
used in model-based analyses. Included are new data from the species Hyptiotes gertschi
Chamberlin & Ivie 1935 (Araneae, Uloboridae), and remarkably similar sequences from
hard ticks (Acari, Ixodidae) that demonstrate the evolutionary conservation of the
overall structure of the arachnid 18S gene.

METHODS 

Taxon sampling.—We sampled exemplar sequences from all arachnid orders and major
spider lineages for which at least one full sequence of the entire 18S gene was
available in GenBank (http://www.ncbi.nlm.nih.gov/Genbank/). To sample unusually long
arachnid 18S sequences thoroughly, we extended our search to include all “partial
sequences” that came within 90 bases (~5% of the gene length) of the primer regions,
since this region is highly conserved across arthropods and the missing length could be
estimated despite lack of primary sequence. In taxa where 18S gene length was uniform
and in the ~1800 base pair (= bp) length typical of arachnids, one individual was
sampled, while all complete or nearly-complete sequences with large (> 3 bp) inserts
relative to the Aphonopelma sp. sequence were included in the sample, since a complete
secondary structure model is available for this taxon (Hendriks et al. 1988).
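
One way to read the length estimate for partial sequences is sketched below in Python. It assumes a pairwise alignment of the partial sequence against the full-length reference, counts the reference bases that lie beyond each end of the partial sequence, and adds them to the observed length. The function name `estimate_full_length` is hypothetical, and applying the 90-base limit to each end separately is an assumption, since the text does not say whether the limit is per end or in total.

```python
from typing import Optional


def estimate_full_length(ref_aln: str, partial_aln: str,
                         max_missing: int = 90) -> Optional[int]:
    """Estimate full gene length for a partial sequence aligned to a reference.

    Both arguments are rows of the same alignment, with '-' for gaps.
    Returns None if the partial sequence is empty or if either missing end
    exceeds `max_missing` reference bases (assumed per-end threshold).
    """
    if len(ref_aln) != len(partial_aln):
        raise ValueError("aligned rows must have equal length")
    columns = [i for i, base in enumerate(partial_aln) if base != "-"]
    if not columns:
        return None
    first, last = columns[0], columns[-1]
    # Reference bases with no counterpart at the 5' and 3' ends.
    missing_5 = sum(1 for base in ref_aln[:first] if base != "-")
    missing_3 = sum(1 for base in ref_aln[last + 1:] if base != "-")
    if max(missing_5, missing_3) > max_missing:
        return None
    observed = len(columns)  # bases actually present in the partial sequence
    return observed + missing_5 + missing_3


# Toy example: a 2-base insert in a sequence missing 1 and 2 terminal bases.
toy_estimate = estimate_full_length("ACG--TACGT", "-CGAATAC--")
```

Here the reference has 8 bases and the partial sequence 7, so the estimate of 10 reflects the 2-base insert plus the 3 missing terminal bases.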

Among spiders, lineages included representatives from Mygalomorphae (Aphonopelma sp.,
Theraphosidae), Mesothelae (Liphistius bicoloripes, Liphistiidae), Orbicularia (H.
gertschi, Uloboridae), “derived orb-weavers” (sensu Coddington 1990) (Nesticus
cellulanus, Nesticidae) and the RTA (retrolateral tibial apophysis) clade (sensu
Coddington et al. 2004) (Coelotes terrestris, Amaurobiidae) (see Table 1). All spider
sequences were from GenBank except for the H. gertschi data, for which new sequence
data were generated in the current study. Specimens of H. gertschi were collected by S.
Lew at the Angelo Reserve, Mendocino County (39°43'N, 123°39'W), and from Del Norte
County near the Oregon border (41°59'N, 123°43'W), California, USA, in May 2003.
Voucher specimens are stored in the collection of the Essig Museum of Entomology, UC
Berkeley, under the voucher codes EMEC50654 and EMEC50993, respectively.

Molecular methods.—DNA was extracted from two legs from each of the H. gertschi
specimens using the standard protocol of the Qiagen DNeasy Tissue Kit. The 18S gene was
directly PCR-amplified in three parts using primer pairs 1F-5R, 5F-9R, and 3F-7R (after
Table 1.—Taxa examined, with total gene length and p-distance between primary sequence of
individuals and Aphonopelma sp. reference sequence. Length includes terminal primer regions
(approximately 50 base pairs (= bp); 23 per primer + 4 bp downstream in the 3' direction from
primer 9R). * indicates partial sequences (see text), with lengths estimated by assuming uniform
sequence length relative to reference sequence at missing terminal regions.

Order                          Taxon sampled                                     GenBank      Total length  Uncorrected p-distance
(lineage represented)                                                            accession    (bp)          to reference

Araneae (Orbicularia)          Hyptiotes gertschi Chamberlin & Ivie 1935         DQ015708     1902          10.8%
Araneae (Mygalomorphae)        Aphonopelma sp.                                   X13457       1814          —
Araneae (Mesothelae)           Liphistius bicoloripes Ono 1988                   AF007104     1808          2.4%
Araneae (derived orb-weavers)  Nesticus cellulanus (Clerck 1757)                 AF005447     1816          7.8%
Araneae (RTA clade)            Coelotes terrestris (Wider 1834)                  AJ007986     1814          5.7%
Opiliones                      Odiellus troguloides (Lucas 1847)                 X81441       1810          6.2%
Scorpiones                     Androctonus australis (Linnaeus 1758)             X77908       1812          6.7%
Pseudoscorpiones               Roncus cf. pugnax (Navas 1918)                    AF05443      1808          11.0%
Acari (Ixodidae)               Amblyomma glauerti Keirans, King & Sharrad 1994   AF115372     1802          10.3%
Solifugae                      Eusimonia wunderlichi Pieper 1977                 U29492       1802          7.7%
Amblypygi                      Paraphrynus sp.                                   AF005445     1810          4.8%
Uropygi                        Mastigoproctus giganteus (Lucas 1835)             AF005446     1810          4.9%
Schizomida                     Stenochrus portoricensis Chamberlin 1922          AF005444     1809          5.5%
Ricinulei                      Pseudocellus pearsei (Chamberlin & Ivie 1938)     U91489       1813          5.1%
Palpigradi                     Eukoenenia sp.                                    AF207648     1810          7.4%
Acari (Ixodidae)               Sejus sp.                                         AF287237     1880*         21.1%
Acari (Ixodidae)               Lohmannia sp.                                     AF287234     1887*         12.3%
Acari (Ixodidae)               Alicorhagidia sp.                                 AF022024     1872*         13.8%


Giribet et al. 1999), to amplify fragments of length ~950 bp, ~850 bp, and ~1000 bp,
covering the first half, second half, and an overlapping central portion of the gene
respectively. The PCR protocol (modified from Hedin & Maddison 2001) used 35 cycles of
30 s at 95° C melting temperature, followed by 30 s at an annealing temperature of
52° C, followed by an extension step of 45 s at 72° C, with 3 s added to this extension
for every cycle after the first. In addition, these cycles were preceded by an initial
melting at 95° C for 2 min, with a 7 min final extension at 72° C. A MasterAmp PCR
Optimization Kit (Epicentre Technologies) was used to choose appropriate buffers to
stabilize the PCR reaction, which improved yield and quality of PCR products. Molecular
extraction vouchers were stored at -80° C.
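
The incrementing extension step makes the total run time of the cycling program less obvious than usual, so a short Python calculation is given here. It uses only the step durations stated in the protocol above and ignores ramp times between temperatures, which the text does not report; the constant and function names are illustrative.

```python
# Step durations in seconds, as stated in the PCR protocol.
INITIAL_MELT_S = 2 * 60
FINAL_EXTENSION_S = 7 * 60
CYCLES = 35
MELT_S = 30
ANNEAL_S = 30
BASE_EXTENSION_S = 45
EXTENSION_INCREMENT_S = 3  # added for every cycle after the first


def extension_time(cycle: int) -> int:
    """Extension time in seconds for a 1-based cycle number."""
    return BASE_EXTENSION_S + EXTENSION_INCREMENT_S * (cycle - 1)


def total_hold_time() -> int:
    """Total programmed hold time in seconds (ramp times not included)."""
    cycling = sum(MELT_S + ANNEAL_S + extension_time(cycle)
                  for cycle in range(1, CYCLES + 1))
    return INITIAL_MELT_S + cycling + FINAL_EXTENSION_S


last_extension = extension_time(CYCLES)  # 147 s in the final cycle
total_seconds = total_hold_time()        # 6000 s, i.e. 100 min
```

Under these assumptions the extension step grows from 45 s to 147 s over the run, and the programmed holds sum to 100 minutes.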

Gene size, purity and concentration were assessed by running out a portion of the PCR
product on a 1.5% TBE/agarose gel. PCR products were cleaned using the Qiagen QiaQuick
PCR Purification Kit, and cycle sequenced in both directions using dye terminators
(after Sanger et al. 1977). Cycle sequencing products were analyzed using an ABI 3730
capillary autosequencing machine. Individual sequences were checked against their
complementary sequences using Sequencher 3.1.1 (Gene Codes Corporation). This program
was also used to assemble contigs of all three overlapping primer regions for each H.
gertschi specimen, to create a single sequence for each. Additional sequences (see
Table 1 for accession numbers) were acquired from GenBank for comparison.

Alignment and Comparisons.—Using Clustal X version 1.83 (Chenna et al. 2003), a static
alignment of multiple sequences was created using the default settings (gap insertion
cost 15, gap extension cost 6.66, transition cost 0.5 times transversion cost).
Treating the Aphonopelma sp. sequence and model as references, gaps in the aligned
sequences were compared to the secondary structure model to detect regions where
insertions and deletions have occurred.
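
Locating inserts relative to the reference can be sketched as a scan for runs of gap characters in the reference row of the alignment. The Python below is a minimal illustration, not the authors' procedure: the function name `find_insertions` is hypothetical, the default `min_len=4` is chosen to match the "> 3 bp" criterion used elsewhere in the Methods, and columns that are gapped in both rows (as happens when other taxa force a gap) are simply skipped.

```python
from typing import List, Tuple


def find_insertions(ref_aln: str, query_aln: str,
                    min_len: int = 4) -> List[Tuple[int, int]]:
    """Return (reference_position, insert_length) for each insert in the query.

    `reference_position` is the number of reference bases that precede the
    insert. Only runs of at least `min_len` inserted bases are reported.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned rows must have equal length")
    inserts: List[Tuple[int, int]] = []
    ref_pos = 0  # reference bases seen so far
    run = 0      # length of the current run of inserted query bases
    for ref_base, query_base in zip(ref_aln, query_aln):
        if ref_base == "-":
            if query_base != "-":
                run += 1  # query base opposite a reference gap
            continue      # columns gapped in both rows do not break a run
        # A reference base closes any open run of inserted bases.
        if run >= min_len:
            inserts.append((ref_pos, run))
        run = 0
        ref_pos += 1
    if run >= min_len:
        inserts.append((ref_pos, run))
    return inserts


# A 5-base insert after reference base 4 is reported; a 2-base one is not.
toy_inserts = find_insertions("ACGT-----ACGT--A", "ACGTGGCCAACGTTTA")
```

The reported reference positions can then be looked up in the secondary structure model to see which helix or loop each insert falls in.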

Modeling of insert regions in H. gertschi.—In areas with insertions > 3 bp relative
to the Aphonopelma sp. reference sequence, secondary structures were modeled using the
web-interface version of the RNAfold program in the Vienna Package
(http://www.tbi.univie.ac.at/~ivo/RNA/; Hofacker 2003). This program uses a
dynamic-programming algorithm with a variety of parameters to estimate the secondary
structure based on minimizing the free energy of the possible stem-loop structures from
the primary sequence. The parameters used include RNA base-pairing energies plus a
variety of experimentally determined adjustments to this cost, including energy
estimates for loops of various sizes and locations, and for single- and double-stranded
multi-base motifs known to affect local stability (Mathews et al. 1999).
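
To give a feel for the dynamic-programming idea, the Python sketch below implements the much simpler Nussinov-style recursion, which maximizes the number of base pairs rather than minimizing free energy. It is a deliberately simplified stand-in and is not the algorithm or parameter set used by RNAfold: there are no stacking or loop energies, and the minimum hairpin loop of 3 bases and the allowance of G-U pairs are assumptions of this sketch.

```python
# Watson-Crick pairs plus the G-U wobble pair (assumed pairing rules).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq: str, min_loop: int = 3) -> str:
    """Return a dot-bracket structure with the maximum number of nested pairs."""
    n = len(seq)
    if n == 0:
        return ""
    # best[i][j] = maximum number of pairs within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: base j is unpaired
            for k in range(i, j - min_loop):  # case: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    # Trace back one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


hairpin = nussinov("GGGGGAAAACCCCC")
```

For the toy hairpin the only way to form five nested G-C pairs is a single stem closed by the AAAA loop. An energy-based method fills a similar table but scores loops and stacked pairs with measured free energies instead of counting pairs.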

The expanded regions of the H. gertschi 18S gene were modeled using the primary
sequence of the insert region plus the four complementary bases at each insertion site
that could be homologized with the reference sequence and structure from Aphonopelma
sp. Because the RNAfold program had difficulty aligning the complementary 4-base-pair
ends properly in some cases, we added a complementary 5-nucleotide extension at each
terminus to anchor the ends of the sequence and stand in for the rest of the structure,
so that each primary sequence input took the form: 5'-GGGGG-primary sequence-CCCCC-3'.
We also tested the RNAfold program's ability to predict the structures in the variable
arms of the Hendriks et al. (1988) Aphonopelma sp. model using the same method, then
calculated the percentage of base-pairs and loop members in the algorithm output that
correctly matched the model.
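
The input construction and the comparison against the published model can be sketched as follows. The helper names (`clamp_insert`, `base_pairs`, `pair_recovery`) are hypothetical, the structures are assumed to be in dot-bracket notation, and only the base-pair part of the comparison is shown; scoring loop members would need a similar per-position comparison.

```python
from typing import List, Set, Tuple


def clamp_insert(region: str, clamp_len: int = 5) -> str:
    """Build the folding input 5'-GGGGG-region-CCCCC-3' described above."""
    return "G" * clamp_len + region + "C" * clamp_len


def base_pairs(structure: str) -> Set[Tuple[int, int]]:
    """Return the set of (i, j) paired positions in a dot-bracket string."""
    stack: List[int] = []
    pairs: Set[Tuple[int, int]] = set()
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError("unmatched ')' at position %d" % index)
            pairs.add((stack.pop(), index))
    if stack:
        raise ValueError("unmatched '(' in structure")
    return pairs


def pair_recovery(model: str, predicted: str) -> float:
    """Fraction of the model's base pairs that the prediction reproduces."""
    reference = base_pairs(model)
    if not reference:
        raise ValueError("model structure contains no base pairs")
    return len(reference & base_pairs(predicted)) / len(reference)


clamped = clamp_insert("AUGC")
recovery = pair_recovery("((..))", "(....)")  # one of two model pairs found
```

Positions in the two structures must refer to the same sequence, so the clamp bases would be stripped from, or added to, both before comparison.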

RESULTS 

Comparisons of 18S sequence length and primary sequence divergence are given in
Table 1. Most arachnid 18S sequences are in the range of 1800-1810 base pairs (including
primer regions), with the H. gertschi and tick sequences being the exceptions (greater
by 85 and 56-64 bases, respectively). Figure 1 shows a schematic of the Hendriks et al.
(1988) model of spider 18S, with regions with H. gertschi and tick inserts marked with
black arrows and a hatched bar, respectively.

Hyptiotes gertschi and tick insertions.—The two H. gertschi sequences are identical,
as are sequences from an additional set of amplifications of the Del Norte County
specimen. However, in contrast to the marked similarity in length and structure across
the arachnid exemplars from GenBank, the H. gertschi sequences are 86 nucleotides
longer than the next-longest spider 18S (Nesticus cellulanus), and we estimate it to be
21 nucleotides longer than the next-longest arachnid 18S (Sejus sp., Acari, Ixodidae).
This additional sequence is found in two large inserts, a 28-base extension of the
helical arm E10-1 (Figure 2a) and a 27-base extension of E10-2 (Fig. 2b), and the rest
is found as smaller inserts of 5, 6, and 13 nucleotides in helices 6, 41, and 47
respectively (Figs. 3a, b, and c). Three near-complete tick sequences were also found
to have large inserts in the terminal arms of branch 10 (see Table 1).
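
The length figures quoted above can be cross-checked with a few lines of Python, using only the total lengths listed in Table 1 and the five insert sizes given in this paragraph. The variable names are illustrative, and the sketch makes no claim about where any bases not covered by the five listed inserts are located, since the text does not itemize them.

```python
# Total 18S lengths (bp) as listed in Table 1.
LENGTHS = {
    "Hyptiotes gertschi": 1902,
    "Nesticus cellulanus": 1816,  # next-longest spider sequence
    "Aphonopelma sp.": 1814,      # reference sequence
}

# Insert sizes (bases) reported for H. gertschi, keyed by helix.
INSERTS = {"E10-1": 28, "E10-2": 27, "6": 5, "41": 6, "47": 13}

excess_over_next_spider = LENGTHS["Hyptiotes gertschi"] - LENGTHS["Nesticus cellulanus"]
excess_over_reference = LENGTHS["Hyptiotes gertschi"] - LENGTHS["Aphonopelma sp."]
listed_insert_total = sum(INSERTS.values())
e10_share = (INSERTS["E10-1"] + INSERTS["E10-2"]) / listed_insert_total
```

The difference from the next-longest spider sequence comes out at 86 bases, matching the text. The five listed inserts sum to 79 bases, of which the two E10 extensions contribute 55, or about 70%.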

Models of the H. gertschi inserts based on free-energy minimization show that the
inserted sequences extend the helices in all of the cases except structure 6, where the
helical stem is shortened and the terminal loop expanded. The RNAfold program also
correctly replicated 90% (40 of 48) of the stem base-pairs in the Hendriks et al.
(1988) model in the arms in which the inserts were found, which is comparable with the
73% base-pairing accuracy for this parameter set calculated by Mathews et al. (1999)
using a variety of other well-characterized genes.

DISCUSSION 

The most remarkable results from the current study were the number of sizable
insertions in the H. gertschi sequences and, in particular, the shared expansion of
stem E10 found in both H. gertschi and ticks. In H. gertschi all insertions took place
in known “variable regions” of the eukaryote 18S structure; helices 6, E10-1 and -2,
41, and 47 belong to regions that have been characterized as variable across a range of
taxa (Van de Peer 1996). The RNAfold models show that much of the inserted H. gertschi
sequence is self-complementary and these insertions function to extend the stem
regions, rather than increase the loop size, except in Helix 6. This extension and
maintenance of complementary stems has remarkable parallels with the findings of
Hancock & Vogler (2000), who revealed a similar pattern of complementary stem expansion
in the evolution of hypervariable regions of 18S in tiger beetles.

Figure 1.—General model for secondary structure of arachnid 18S ribosomal DNA. Small numbers
refer to general stem-loop numbering system after Nelles et al. (1984). Those features preceded by an
“E” are specific to eukaryotic 18S sequences. Arrows and bold numbers refer to locations and size (in
bases relative to tarantula model) of H. gertschi inserts, respectively. Hatched bar represents region with
large inserts found in 3 tick taxa.

Figures 2-4.—Region of arachnid 18S gene with unusually large inserts. 2. Basic structure for
helices 10, E10-1 and E10-2 in arachnids. Primary sequence and model after Hendriks et al. (1988);
3. Expanded E10-2 stem-loop structure in H. gertschi; 4. Expanded E10-1 stem loop in H. gertschi.
Boxes represent areas of homology where expanded arms attach to conserved, adjacent parts of the
structure.

A variety of factors suggest the H. gertschi sequence represents the functional 18S
rRNA gene rather than a pseudogene copy or experimental artifact: the presence of
insertions in areas amplified by three different primer pairs, the replication of the
sequence in individuals from two populations, the maintenance (and, in some cases,
increase in length) of the base-pairing in stem structures, and the non-random,
self-complementary sequence found in both large and small inserts. One would expect to
see decay of pseudogene sequence relative to the selectively-constrained functional
gene (Giribet & Wheeler 2001), and this decay (in the form of random mutations) should
be equally distributed throughout the sequence rather than restricted to known variable
regions. Though some metazoans, such as helminths, are known to carry two functional
copies of the 18S gene with differing sequences (Carranza et al. 1996), we found only
single bands in our agarose gel visualizations of all three PCR products, and no second
allele or double-peak patterns were seen in the raw sequence data, despite repeating
the amplification and sequencing processes, suggesting the existence of a single,
uniform 18S sequence in H. gertschi.

Figures 5-7.—Smaller inserts and secondary structure model comparisons. 5. Structure 6 with 5
base pair insertion in H. gertschi; 6. Structure 41 with 6 inserted bases in H. gertschi; 7. Structure 47
with 13 inserted bases in H. gertschi.

How exceptional is the H. gertschi 18S rRNA? It seems to be an anomaly for spiders,
which, based on those sampled in the current study, generally vary little even between
the most distantly related lineages. Along with the tick data, these anomalous
sequences show that the presence of large inserts in arachnids, though rare, is
consistent with a general eukaryotic 18S evolutionary tendency: overall structure is
maintained, while rare inserts appear in known variable areas. It is worth noting that
only the H. gertschi gene contains expansions of several variable regions while the
tick inserts are restricted to the E10 region.

With only two populations of H. gertschi sampled in the current study, we cannot
establish the phylogenetic distribution of the 18S rDNA extension, whether it occurs in
other Hyptiotes species, other uloborid genera, or their sister group, the Deinopidae.
Such extensions have not been described from other Orbicularia, the sister-group of the
deinopoid (Uloboridae + Deinopidae) clade, but may be more widespread within the
deinopoids. The large number of inserts found in H. gertschi in this study suggests
that there might be variation in number and size of inserts found in different taxa,
since it seems unlikely that all these inserts appeared simultaneously. It would be
interesting to map any such variation onto the existing phylogeny of deinopoid genera
(Coddington 1986) to determine the pattern of evolution of the 18S gene in greater
detail. Alternatively, if variation seems to be limited to a subset of Hyptiotes or a
small number of uloborid genera, the inserts themselves could be evaluated as
phylogenetic characters within these groups.

Once the taxonomic range and amount of variation of these inserts has been
characterized, hypotheses about causal biological, physiological or ecological factors
allowing such inserts to occur can be tested rigorously. Comparative studies could also
be performed with the wide range of metazoan taxa which show similar inserts, such as
myriapods, proturans, and helminth worms, to find general correlations between 18S size
and phenotypes (Carranza et al. 1996; Giribet & Wheeler 2001).


While the RNA folding algorithm used may not predict intra-molecular folding dynamics
of rRNA accurately, it provides a repeatable method for producing a plausible structure
for sequence data which lacks obvious homologs from related taxa for comparison. The
algorithm also gives a reasonable basis for judging whether an insert extends a helix,
a loop, or both, particularly with short sequences such as those modeled in this study.
With changes affecting sequence length concentrated in the terminal ends of known
stem-loop structures, the H. gertschi and tick 18S genes appear to be the exceptions
that prove the rule; bases come and go, albeit very rarely, but the helical backbone
and location of stem and loop structures of the small subunit ribosomal RNA have
remained conserved throughout arachnid evolution.

Because the expanded sequence has only been found in a single spider species, it is
impossible to argue that this is evidence of any trend in 18S evolution in spiders.
However, the increase in size of the 18S gene seen here differs from the documented
trend (relative to other metazoans) toward reduction in size in the hypervariable
region of the mitochondrial 16S gene (Smith & Bond 2003) and in the arms of some tRNA
genes (Masta & Boore 2004) of spiders. It should be noted that the cited trends are in
mitochondrial genes, whereas 18S is part of the nuclear genome, and the two genomes may
be subject to different selective pressures or constraints.

To understand the broader pattern of 18S size change, it is useful to look at the
evolution of the 18S gene in other metazoan taxa. Giribet & Wheeler (2001) showed that
the general pattern of 18S evolution across a much broader sampling of taxa, including
hexapods, chelicerates, myriapods, and crustaceans, is one of conserved length in the
1800 bp range, with occasional increases in size. Deletion events appear to be
exceedingly rare. The data presented here are in keeping with those findings, although
the total length for the H. gertschi 18S gene is greater than that of any of the 49
chelicerate taxa reported by Giribet & Wheeler (2001) or of any full arachnid 18S
sequence currently found in the GenBank database. Complete or nearly-complete sequences
are required for secondary structure modeling since there is long-range complementary
base-pairing in 18S secondary structure (Telford et al. 2005). Unfortunately for the
pursuit of whole-gene comparisons and structural modeling, many arachnid studies have
used only half of the 18S gene (e.g., Wheeler & Hayashi 1998; Arnedo et al. 2004), and
there is a sizable and important spider lineage, the Haplogynae, for which no complete
18S sequence is available.

As for the overall utility of 18S data for arachnid phylogeny, this data set shows that
the Hendriks et al. (1988) model is sufficient for locating structural changes in the
18S gene for all known arachnid genotypes, though the value of modeling based solely on
primary sequence using thermodynamic predictions on a single taxon is limited. In the
majority of cases, alignment is trivial with a mean of less than 1% sequence length
divergence between most taxa. This similarity in sequence-length across Araneae is also
corroborated by data from a family-level study of the RTA clade by the authors
(unpublished data). For data partitioning, relative-weighting, and model-choice
purposes, the number and locations of the stem-loop structures of the 18S gene remain
largely identical to that of most sequenced eukaryotes. Determination of stem and loop
regions should be achievable with some confidence for the large majority of arachnid
cases, using the Hendriks et al. (1988) model as a guide.

In exceptional situations where homology assessments or stem-loop status of individual
bases are difficult, such as within the tick-specific inserts seen here, removing the
insert data may be defensible, for two reasons. First, the tendency of insertions to
occur independently in the same areas across widely divergent taxa (such as spiders and
ticks having inserts in the E10 region) could plausibly add homoplasy to a data matrix
and place taxa in incorrect groupings. Second, because the insertions tend to be small
relative to the more easily homologized portions of the rRNA (much smaller, for
instance, than the > 200 bp insertions found in many myriapods, see Giribet & Wheeler
2001), and only occur in a small number of taxa, there is likely to be little reduction
in phylogenetic signal from the sequence if an insert region is excluded from analysis.